We excluded 10 participants for responding randomly, missing at least one of the four experiments, or otherwise not complying with task instructions. This left us with 47 high-verbal and 46 low-verbal participants. All plots visualize categorical differences between the two groups, while all statistical models use verbal score as a continuous predictor.

Same/different judgments

Prior to the analyses below, we excluded trials with reaction times above 5 seconds or below 200 ms.

Descriptive statistics by group: Same/different judgments

Overall, participants made the correct judgment on 95.32 % of trials. Accuracy did not differ between the high-verbal (95.58 %) and the low-verbal group (95.48 %). In subsequent analyses and plots, we only include correct trials. See Figure XX below for reaction times in the high-verbal and low-verbal groups for category (‘do these two animals belong to the same category?’) and identity (‘are these two animals identical?’) judgments.

Statistical models: Same/different judgments

We conducted a linear mixed model of verbal score and judgment type predicting log-transformed reaction time, including random intercepts per participant. This model indicated a significant main effect of judgment type and a marginally significant effect of verbal score. Identity judgments were faster than category judgments (\(\beta\) = -0.19, SE < 0.01, t = -40.95, p < .001), and a higher verbal score was marginally associated with faster reaction times (\(\beta\) = -0.03, SE = 0.02, t = -1.87, p = 0.065).
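A sketch of this model in lme4 syntax (the data frame and column names here are illustrative assumptions, not the original objects):

```r
library(lme4)

# Hypothetical sketch of the model described above; 'sd_df', 'log_rt',
# 'verbal_score', 'judgment_type', and 'participant' are assumed names.
m_judgment <- lmer(log_rt ~ verbal_score + judgment_type + (1 | participant),
                   data = sd_df)
summary(m_judgment)  # fixed effects as reported in the text
```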

The key test for this experiment was whether the two groups behaved differently when giving correct ‘DIFFERENT’ responses on identity trials when the two images belonged to the same category. That is, we expected high verbal participants to be more susceptible to interference from a same-category distractor.

library(ggplot2)
library(ggforce)  # for geom_sina()

pd <- position_dodge(width = 0.2)  # dodge width is assumed here

ggplot(SD_rt_df_key_comparison, aes(same_category_animal, rt, color = high_low_verbal)) +
  geom_sina(data = SD_rt_df_key_comparison_individual, aes(same_category_animal, rt), alpha = 0.3) +
  geom_errorbar(aes(ymin = rt - ci, ymax = rt + ci), width = .1, position = pd) +
  stat_summary(fun = mean, geom = 'point', aes(group = high_low_verbal), position = pd) +
  stat_summary(fun = mean, geom = 'line', aes(group = high_low_verbal), size = 1, position = pd) +
  theme_bw() +
  labs(y = 'RT', title = 'Latency to correct DIFFERENT response on identity trials',
       x = 'Between or within category distractor') +
  scale_color_manual(values = color_palette[c(4, 6)])

A linear mixed model of log-transformed reaction time with verbal score and category membership of the distractor as predictors, including random intercepts per participant, provided no evidence that high-verbal participants were differentially affected by within-category interference (interaction effect: p = 0.97). However, there was a significant main effect of category membership of the distractor, with within-category distractors associated with slower reaction times (\(\beta\) = 0.08, SE = 0.03, t = 2.95, p = 0.003).

Additional analyses: Same/different judgments

We also checked whether the kind of animal made a difference on within-category distractor trials.

A linear mixed model of log-transformed reaction times with verbal score and animal pair (dog-dog or cat-cat) as predictors, including random intercepts per participant, provided evidence that dog-dog trials were faster than cat-cat trials (\(\beta\) = -0.11, SE = 0.03, t = -4.23, p < .001). The model corroborated the result that a higher verbal score was associated with faster reaction times (\(\beta\) = -0.05, SE = 0.02, t = -2.31, p = 0.023). However, this effect of verbal score was less strong when the stimuli were dog-dog than when they were cat-cat as indicated by a significant interaction effect between verbal score and animal pair (\(\beta\) = 0.02, SE = 0.01, t = 3.16, p = 0.002).

Strategies: Same/different judgments

In this experiment, most participants said that they had no particular strategy. However, eight high-verbal participants and one low-verbal participant explicitly mentioned verbalizing the problems (e.g. ‘In my head I said “same” or “different” before I pressed the arrow key.’).

Rhyme judgments

We excluded five rhyming pairs because at least one group performed below chance on them. These were bin/chin, cab/crab, rake/cake, wave/cave, and park/shark.

Descriptive statistics by group: Rhyme judgments

Here is a table of accuracy and reaction time for the two groups (high and low verbal) across types of rhyming trials.

group        trial type     RT (ms)  ±CI    accuracy (%)  ±CI
high verbal  non-ortho      1852.66  51.47  82.77         2.86
high verbal  no rhyme (NR)  1930.79  53.26  97.52         1.36
high verbal  ortho          1719.41  54.99  91.21         2.48
low verbal   non-ortho      1970.28  53.85  76.20         3.21
low verbal   no rhyme (NR)  2024.48  60.47  93.84         1.87
low verbal   ortho          1858.94  60.38  83.62         3.22

As can be seen in this table, high verbal participants were generally both faster and more accurate than low verbal participants on all three types of trials. See also figures below.

Statistical models: Rhyme judgments

A model of verbal score, rhyme type, and name agreement for the first image predicting log-transformed reaction time showed no main effect of verbal score (\(\beta\) = -0.01, SE = 0.02, t = -0.64, p = 0.525). It did, however, show a marginally significant effect of rhyme type, with no-rhyme trials slower than non-orthographic rhyme trials (\(\beta\) = 0.06, SE = 0.03, t = 1.82, p = 0.069), and a significant effect of name agreement, with higher name agreement associated with faster reaction times (\(\beta\) = -0.24, SE = 0.03, t = -7.3, p < .001). There were no significant interactions between rhyme type and verbal score.

A second model of verbal score, rhyme type, and name agreement for the first image predicting accuracy showed that no-rhyme trials were easier than non-orthographic trials (\(\beta\) = 1.19, SE = 0.39, z = 3.06, p = 0.002), that a higher verbal score was associated with a higher likelihood of responding accurately (\(\beta\) = 0.19, SE = 0.08, z = 2.35, p = 0.019), and that trials with higher-name-agreement images were significantly easier (\(\beta\) = 0.95, SE = 0.26, z = 3.69, p < .001). Again, there were no significant interactions between rhyme type and verbal score.
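Under the assumption that these models, like the others, included random intercepts per participant, they can be sketched in lme4 (the data frame and column names are assumptions for illustration, not the original objects):

```r
library(lme4)

# Hypothetical sketch of the two rhyme-judgment models; 'rhyme_df' and its
# columns are assumed names. rhyme_type has levels non-ortho / NR / ortho.
m_rt  <- lmer(log_rt ~ verbal_score * rhyme_type + name_agreement +
                (1 | participant), data = rhyme_df)
m_acc <- glmer(correct ~ verbal_score * rhyme_type + name_agreement +
                 (1 | participant), data = rhyme_df, family = binomial)
```

The z statistics reported for the accuracy model are what glmer produces for a binomial (logistic) model, which is why the sketch uses glmer there.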

Strategies: Rhyme judgments

We were interested in whether participants said the words out loud to make the rhyme judgments and so we included this as a question at the end of the rhyming experiment. A chi-squared test showed that there was no significant difference between how many high-verbal participants (23 out of 47) and how many low-verbal participants (21 out of 46) reported that they had said the words out loud (\(\chi^2\)(1) = 0.01, p = 0.913). Nevertheless, the effect of doing so was interestingly different for the two groups as can be seen in the figure below.
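This test can be reproduced from the counts reported above; note that R’s chisq.test applies Yates’ continuity correction to 2 × 2 tables by default, which matches the reported values:

```r
# Counts from the text: 23 of 47 high-verbal and 21 of 46 low-verbal
# participants reported saying the words out loud.
said_out_loud <- matrix(c(23, 24,   # high-verbal: said out loud, did not
                          21, 25),  # low-verbal:  said out loud, did not
                        nrow = 2)
res <- chisq.test(said_out_loud)  # Yates-corrected by default for 2 x 2
round(unname(res$statistic), 2)   # 0.01
round(res$p.value, 3)             # 0.913
```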

For both reaction time and accuracy, saying the words out loud diminished the difference between the two groups. This suggests that high-verbal participants used this strategy covertly; indeed, it was the most common strategy reported by participants from both groups who answered the free-answer question about strategy. There were no other notable strategies in the free answers.

Verbal working memory

Participants were tested on recall of three sets of five words. One set contained words that were phonologically similar but not orthographically similar (bought, sort, taut, caught, and wart), one set contained words that were orthographically similar but not phonologically similar (rough, cough, through, dough, bough), and one set was a control set (plea, friend, sleigh, row, board).

Descriptive statistics by group: Verbal working memory

High verbal participants generally remembered more words correctly both when the correct position was required and when the words could be in any position (see table and figure below).

group        word set  score (of 5)  ±CI   score, any position (of 5)  ±CI
high verbal  ctrlSet   4.19          0.13  4.51                        0.08
high verbal  orthoSet  3.72          0.14  4.18                        0.10
high verbal  phonSet   3.43          0.16  4.11                        0.10
low verbal   ctrlSet   3.69          0.15  4.17                        0.11
low verbal   orthoSet  3.52          0.15  4.10                        0.11
low verbal   phonSet   3.02          0.15  3.81                        0.11

Statistical models: Verbal working memory

We conducted two linear mixed models with original word set (phonologically similar, orthographically similar, or control) and verbal score predicting memory performance, scored either as the correct word in the correct position or as the correct word in any position. Both models included crossed random intercepts for participant and for the presentation order of the stimuli.
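Assuming lme4 with the crossed random intercepts just described, the two models might be sketched as follows (all object and column names are hypothetical):

```r
library(lme4)

# Hypothetical sketch; 'wm_df' and its columns are assumed names, not the
# original objects. Random intercepts are crossed: participant and the
# presentation order of the stimuli.
m_strict  <- lmer(score ~ original_word_set * verbal_score +
                    (1 | participant) + (1 | presentation_order), data = wm_df)
m_lenient <- lmer(score_any_position ~ original_word_set * verbal_score +
                    (1 | participant) + (1 | presentation_order), data = wm_df)
```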

For memory performance requiring both the correct word and the correct position, the set with phonologically similar words was more difficult than the control set (\(\beta\) = -0.62, SE = 0.19, t = -3.18, p = 0.001) but the orthographically similar set was not (\(\beta\) = 0, SE = 0.19, t = 0.02, p = 0.981). A higher verbal score was associated with better memory performance (\(\beta\) = 0.22, SE = 0.09, t = 2.55, p = 0.012). There was a marginally significant interaction between verbal score and the orthographically similar set (\(\beta\) = -0.09, SE = 0.05, t = -1.82, p = 0.068), such that the benefit of a higher verbal score was reduced for that set.

The same pattern was found when the correct word in any position counted as correct: the set with phonologically similar words was more difficult than the control set (\(\beta\) = -0.31, SE = 0.14, t = -2.26, p = 0.024) but the orthographically similar set was not (\(\beta\) = 0.11, SE = 0.14, t = 0.79, p = 0.428). A higher verbal score was associated with better memory performance (\(\beta\) = 0.16, SE = 0.06, t = 2.56, p = 0.012). There was a significant interaction between verbal score and the orthographically similar set (\(\beta\) = -0.09, SE = 0.04, t = -2.45, p = 0.01), again reducing the benefit of a higher verbal score for that set.

Strategies: Verbal working memory

As with the rhyming experiment, we were again interested in whether participants said the words out loud to help them remember them. We asked about this at the end of the experiment. A chi-squared test showed that there was no significant difference between how many high-verbal participants (10 out of 47) and how many low-verbal participants (13 out of 46) reported that they had said the words out loud (\(\chi^2\)(1) = 0.29, p = 0.589). Nevertheless, the effect of doing so was interestingly different for the two groups as can be seen in the figure below.


The difference between the two groups’ memory performance disappears when they report that they said the words out loud to help them remember. Doing so helps low-verbal participants but makes no difference for high-verbal participants. Participants gave some interesting alternative strategies in response to the free answer question about strategies:

High-verbal group

  • Remembering the order of the first letters once the words were familiar (e.g. c, b, t, r, d for ‘cough’, ‘bough’, ‘through’, ‘rough’, ‘dough’). One participant reported this.
  • Finding a cadence/melody and using this to repeat the words.
  • Chunking.
  • Hand and body gestures.
  • Creating a story or a sentence with the words in order (both visual and verbal). This was the most common strategy.

Low-verbal group

  • Remembering the order of the first letters once the words were familiar (e.g. c, b, t, r, d for ‘cough’, ‘bough’, ‘through’, ‘rough’, ‘dough’). This strategy was much more common for the low-verbal group than for the high-verbal group.
  • Forming a story or a narrative. This was a less common strategy than remembering the first letters.

Task switching

We excluded trials over 10 seconds. We also recalculated the accuracy measure so that any trial in the three switch conditions on which the participant did switch between adding and subtracting counted as correct, as long as the arithmetic itself was also correct. We did this to prevent a single failure to switch from causing the remaining trials in a run to count as incorrect.
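A toy illustration of this recoding (the column names are assumed for illustration, not the original ones):

```r
# In the switch conditions, a trial counts as correct when the participant
# switched operation relative to the previous trial and the arithmetic was
# correct, even if an earlier failure to switch left them out of phase
# with the cues.
trials <- data.frame(
  operation_used     = c("add", "sub", "add", "sub"),
  arithmetic_correct = c(TRUE, TRUE, TRUE, TRUE)
)
switched <- c(TRUE,  # the first trial of a run has nothing to switch from
              trials$operation_used[-1] != trials$operation_used[-nrow(trials)])
trials$switching_is_correct <- switched & trials$arithmetic_correct
trials$switching_is_correct  # all TRUE: every trial alternates operations
```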

Descriptive statistics: Task switching

As can be seen from the table and the figure below, accuracy was generally quite high in all conditions.

group        condition    RT (ms)  ±CI    accuracy (%)  ±CI
high verbal  addition     2287.38  47.04  97.94         0.01
high verbal  colorcue     2774.63  61.61  95.64         0.01
high verbal  subtraction  2527.52  53.77  97.65         0.01
high verbal  symbolcue    2564.20  54.44  97.72         0.01
high verbal  uncued       2678.94  59.15  94.59         0.01
low verbal   addition     2312.32  46.34  98.32         0.01
low verbal   colorcue     2781.48  62.98  95.08         0.01
low verbal   subtraction  2572.91  55.19  97.80         0.01
low verbal   symbolcue    2639.81  56.00  96.72         0.01
low verbal   uncued       2709.74  63.84  93.19         0.01

Statistical models: Task switching

To simplify the comparisons, we compared only the symbol-cue condition with the uncued condition, and the color-cue condition with the symbol-cue condition. All models included random intercepts per participant. Logistic mixed models of condition and verbal score predicting accuracy indicated no effect of verbal score (symbol cued versus uncued: \(\beta\) = 0.03, SE = 0.14, z = 0.19, p = 0.852; color cued versus symbol cued: \(\beta\) = 0.15, SE = 0.14, z = 1.07, p = 0.283). There were also no interaction effects (both p > 0.364), but uncued trials were less likely to be accurate than symbol-cued trials (\(\beta\) = -1.24, SE = 0.44, z = -2.82, p = 0.005).

As for log-transformed reaction time, there was likewise no effect of verbal score and no interaction effects (all p > 0.244). However, symbol-cued trials were marginally faster than color-cued trials (\(\beta\) = -0.05, SE = 0.03, t = -1.84, p = 0.066).

Strategies: Task switching

We once again examined differences associated with talking out loud, despite the fact that there were no general differences in performance between the two groups. A chi-squared test showed that there was no significant difference between how many high-verbal participants (20 out of 47) and how many low-verbal participants (13 out of 46) reported that they had talked to themselves out loud during the task (\(\chi^2\)(1) = 1.5, p = 0.221). There were not any obvious differences between the effects that talking out loud had on these two groups (see accuracy and reaction time plots below).

In response to the free-answer question in the task switching experiment, several of the high-verbal participants said that they had said the answers out loud to themselves but not the operation (‘add’, ‘subtract’). One visualized a cartoon character wearing red and giving a thumbs-up or wearing blue and giving a thumbs-down, one used their own thumb to keep track, and one used their fingers to count. Participants from the low-verbal group did not report many specific strategies apart from a few saying the operation or result out loud; one reported that they had tapped their index finger to mean ‘add’ and their middle finger to mean ‘subtract’.

Intertask correlations

We were interested in how performance on the different tasks correlated with each other and whether these correlations were different for the two groups.

Overall intertask correlations

Colored squares are significant at p < .01. Generally, different performance measures correlate within the same experiment. Interestingly, reaction times on rhyming are negatively correlated with verbal working memory score suggesting some working memory involvement in the rhyming task. Accuracy on uncued switch trials in the task switching experiment is also positively correlated with accuracy on verbal working memory and rhyming.
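Figures of this kind can be produced in several ways; one minimal base-R sketch of masking a correlation matrix at p < .01, using simulated stand-in variables rather than the real task measures:

```r
set.seed(42)
# Simulated stand-in for per-participant task measures (not the real data).
d <- data.frame(rhyme_rt   = rnorm(93),
                wm_score   = rnorm(93),
                switch_acc = rnorm(93))
r_mat <- cor(d)
p_mat <- outer(seq_along(d), seq_along(d), Vectorize(function(i, j)
  cor.test(d[[i]], d[[j]])$p.value))
r_mat[p_mat >= .01] <- NA  # keep (i.e. 'color') only cells with p < .01
```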

Intertask correlations for the high-verbal group

Colored squares are significant at p < .01. For high-verbal participants, reaction times on rhyming, category judgments, and task switching are positively correlated, suggesting that these rely on similar mechanisms for this group.

Intertask correlations for the low-verbal group

Colored squares are significant at p < .01. For low-verbal participants, there is no such widespread correlation between rhyming, category judgments, and task switching reaction times. However, they show a positive correlation between accuracy on uncued switch trials and (position-indifferent) verbal working memory and rhyming accuracy.

Difference between task correlations in low-verbal group and high-verbal group

Colored squares are significant at p < .01.

Summary of behavioral findings

Participants who scored lower on verbal representations were slower at making category-based judgments in the same/different experiment. We expected high-verbal participants to show a more marked interference effect of category membership on identity-judgment trials (e.g. being slower to correctly respond ‘different’ to two pictures of different cats), but we did not see this in the data.

In the rhyming experiment, high-verbal participants were significantly better than low-verbal participants. The differences in both accuracy and response time between the two groups were eliminated when participants reported naming the pictures out loud.

In the verbal working memory experiment, the high-verbal group performed better at both position-specific recall and position-indifferent recall. Once again, the differences between the two groups were diminished when participants reported talking out loud to remember the words.

There were no notable differences between the two groups in the task switching experiment.

Questionnaire measures

For unknown reasons, we do not have questionnaire data from A3KVKK1XLBTSN3. We retain their data from the four behavioral experiments and here report questionnaire data from 47 high-verbal and 45 low-verbal participants.

Here is a plot of all our custom questions. Dark blue represents the high-verbal group and dark yellow represents the low-verbal group.

Questionnaire answers.